Module 01

Module 01 portfolio check

The first of your second level headers (##) is to be used for the portfolio content checks. The Module 01 portfolio check has been built for you directly into this template, and is also provided as a stand-alone markdown document on the MICB425 GitHub so that you know what is required in each module section of your portfolio. The completion status and comments will be filled in by the instructors during portfolio checks, when your current portfolios are pulled from GitHub.

  • Installation check
    • Completion status:
    • Comments:
  • Portfolio repo setup
    • Completion status:
    • Comments:
  • RMarkdown Pretty html Challenge
    • Completion status:
    • Comments:
  • Evidence worksheet_01
    • Completion status:
    • Comments:
  • Evidence worksheet_02
    • Completion status:
    • Comments:
  • Evidence worksheet_03
    • Completion status:
    • Comments:
  • Problem Set_01
    • Completion status:
    • Comments:
  • Problem Set_02
    • Completion status:
    • Comments:
  • Writing assessment_01
    • Completion status:
    • Comments:
  • Additional Readings
    • Completion status:
    • Comments:

Data science Friday

The remaining second level headers (##) are for separating data science Friday, regular course, and project content. In this module, you will only need to include data science Friday and regular course content; projects will come later in the course.

Installation check

Third level headers (###) should be used for links to assignments, evidence worksheets, problem sets, and readings, as seen here.

Use this space to include your installation screenshots.
Windows GitBash Terminal

RStudio


GitHub homepage


Portfolio repo setup

Detail the code you used to create, initialize, and push your portfolio repo to GitHub. This will be helpful as you will need to repeat many of these steps to update your portfolio throughout the course.

Starting after registering for a new GitHub account:
1. git init
2. git add .
3. git commit -m "First commit"
4. git remote add origin https://github.com/judyban/MICB425_portfolio
5. git remote -v
6. git push -u origin master

Repeat steps 2 and 3, then run git push, to push new material to the repository.

RMarkdown pretty html challenge

The following assignment is an exercise for the reproduction of this .html document using the RStudio and RMarkdown tools we’ve shown you in class. Hopefully by the end of this, you won’t feel at all the way this poor PhD student does. We’re here to help, and when it comes to R, the internet is a really valuable resource. This open-source program has all kinds of tutorials online.
http://phdcomics.com/ Comic posted 1-17-2018

Challenge Goals

The goal of this R Markdown html challenge is to give you an opportunity to play with a bunch of different RMarkdown formatting. Consider it a chance to flex your RMarkdown muscles. Your goal is to write your own RMarkdown that rebuilds this html document as close to the original as possible. So, yes, this means you get to copy my irreverent tone exactly in your own Markdowns. It’s a little window into my psyche. Enjoy =)

hint: go to the PhD Comics website to see if you can find the image above
If you can’t find that exact image, just find a comparable image from the PhD Comics website and include it in your markdown

Here’s a header!

Let’s be honest, this header is a little arbitrary. But show me that you can reproduce headers with different levels please. This is a level 3 header, for your reference (you can most easily tell this from the table of contents).

Another header, now with maths

Perhaps you’re already really confused by the whole markdown thing. Maybe you’re so confused that you’ve forgotten how to add. Never fear! A calculator R is here:

1231521+12341556280987
## [1] 1.234156e+13

Table Time

Or maybe, after you’ve added those numbers, you feel like it’s about time for a table! I’m going to leave all the guts of the coding here so you can see how libraries (R packages) are loaded into R (more on that later). It’s not terribly pretty, but it hints at how R works and how you will use it in the future. The summary function used below is a nice data exploration function that you may use in the future.

library(knitr)
kable(summary(cars),caption="I made this table with kable in the knitr package library")
Table: I made this table with kable in the knitr package library

| speed        | dist           |
|--------------|----------------|
| Min. : 4.0   | Min. : 2.00    |
| 1st Qu.:12.0 | 1st Qu.: 26.00 |
| Median :15.0 | Median : 36.00 |
| Mean :15.4   | Mean : 42.98   |
| 3rd Qu.:19.0 | 3rd Qu.: 56.00 |
| Max. :25.0   | Max. :120.00   |

And now you’ve almost finished your first RMarkdown! Feeling excited? We are! In fact, we’re so excited that maybe we need a big finale eh? Here’s ours! Include a fun gif of your choice!

Origins and Earth Systems

Evidence worksheet 01

The template for the first Evidence Worksheet has been included here. The first thing for any assignment should be link(s) to any relevant literature (which should be included as full citations in a module references section below).

You can copy-paste in the answers you recorded when working through the evidence worksheet into this portfolio template.

As you include Evidence worksheets and Problem sets in the future, ensure that you delineate Questions/Learning Objectives/etc. by using headers that are 4th level and greater. This will still create header markings when you render (knit) the document, but will exclude these levels from the Table of Contents. That’s a good thing. You don’t want to clutter the Table of Contents too much.

Whitman et al 1998

Learning objectives

Describe the numerical abundance of microbial life in relation to ecology and biogeochemistry of Earth systems.

General questions

  • What were the main questions being asked?
    • 1. What is the estimated abundance of prokaryotes in the various reservoirs on Earth?
    • 2. What is the estimated content of the primary nutrients held by prokaryotes in these reservoirs?
  • What were the primary methodological approaches used?
    • Pooled data from the literature, with extrapolation from averages of prokaryotic characteristics such as cell volume and density, and of environmental characteristics such as porosity. Carbon content was estimated from the average carbon content per prokaryotic cell.

  • Summarize the main results or findings.
    • Earth’s prokaryotes contain about 10-fold more nitrogen and phosphorus than plants and represent the largest pool of these nutrients in living organisms. The total amount of prokaryotic carbon is 60-100% of the estimated total carbon in plants. The largest reservoirs of prokaryotes are the open ocean, soil, and the oceanic and terrestrial subsurfaces, and the ocean has the fastest turnover of prokaryotes. The large population size and rapid growth of prokaryotes can introduce mutations that lead to a large capacity for genetic diversity.
  • Do new questions arise from the results?
    • The current data carry a lot of uncertainty from the methods used, such as in the estimate of prokaryote abundance in groundwater.
      Questions also arise from these uncertainties, such as: how does prokaryotic turnover affect the nutrient cycles? The genetic diversity of prokaryotes is vast and our understanding of it is limited; the authors are also uncertain about how all the metabolic pathways of microbes fit into the currently known nutrient cycles.
  • Were there any specific challenges or advantages in understanding the paper (e.g. did the authors provide sufficient background information to understand experimental logic, were methods explained adequately, were any specific assumptions made, were conclusions justified based on the evidence, were the figures or tables useful and easy to understand)?
    • Challenges:
      • The authors did not provide enough background on prokaryote abundance on Earth for readers new to the topic, but rather jumped straight into the numbers.
      • Minimal explanation of their calculations and methods.
    • Assumptions:
      • The authors acknowledged that they extrapolated data from the literature, but did not critically assess the validity of the values taken from these sources.
      • The authors were also aware of areas of conflict between data presented in different sources.
    • Conclusions:
      • The results were impactful, and evidence from a variety of primary literature was presented to support the conclusions.
      • The figures and tables were straightforward and easy to understand.

Problem set 01

Whitman et al 1998

Learning objectives:

Describe the numerical abundance of microbial life in relation to the ecology and biogeochemistry of Earth systems.

Specific questions:

  • What are the primary prokaryotic habitats on Earth and how do they vary with respect to their capacity to support life? Provide a breakdown of total cell abundance for each primary habitat from the tables provided in the text.
    | Primary habitat | Total cell abundance |
    |---|---|
    | Open ocean | 1.2 x 10^29 |
    | Soil | 2.6 x 10^29 |
    | Subsurface | 3.8 x 10^30 |
    | Oceanic subsurface | 3.5 x 10^30 |
    | Terrestrial subsurface | 0.25 x 10^30 - 2.3 x 10^30 |
  • What is the estimated prokaryotic cell abundance in the upper 200 m of the ocean and what fraction of this biomass is represented by marine cyanobacterium including Prochlorococcus? What is the significance of this ratio with respect to carbon cycling in the ocean and the atmospheric composition of the Earth?

    • The estimated prokaryotic cell abundance in the upper 200 m of the ocean is 3.6 x 10^28 cells, at a cell density of 5 x 10^5 cells/mL. Cyanobacteria have a density of 4 x 10^4 cells/mL, so marine cyanobacteria, including Prochlorococcus, represent 8% of the total prokaryotic cell abundance. Since they are autotrophs, they are major players in driving one part of the carbon cycle by assimilating inorganic carbon into organic carbon through photosynthesis. They also drive the food web by sitting at the bottom of it and serving as primary producers.
  • What is the difference between an autotroph, heterotroph, and a lithotroph based on information provided in the text?

    • Autotroph – “self-nourishing” – uses inorganic carbon to produce complex organic carbon as a source of carbon for other organotrophs (fixing CO2 into biomass).

    • Heterotroph – uses organic carbon assimilated by autotrophs as sources of carbon. Assimilate organic carbon.

    • Lithotroph – obtains electrons (reducing agents) from inorganic chemicals, which need not be inorganic carbon. Uses inorganic substrates.

  • Based on information provided in the text and your knowledge of geography what is the deepest habitat capable of supporting prokaryotic life? What is the primary limiting factor at this depth?

    • The deepest habitat is the oceanic subsurface, which extends up to 4 km below ground. With increasing depth, high temperature (up to 125 °C) becomes the primary limiting factor for prokaryotic life. Temperature increases by about 22 °C/km.
  • Based on information provided in the text and your knowledge of geography what is the highest habitat capable of supporting prokaryotic life? What is the primary limiting factor at this height?
    • The highest place prokaryotes have been found is 77 km, but most bacteria found up there were transported there rather than being native inhabitants. A realistic upper limit for habitats is around 20 km. I predict that most of them would be spore-forming bacteria.
      While the paper did not provide a clear answer on the limiting factor, the thinning of atmospheric gases at high altitudes limits the nutrients available for prokaryote uptake. There is also a lack of moisture in the upper atmosphere that could lead to desiccation, and UV radiation is strong.
  • Based on estimates of prokaryotic habitat limitation, what is the vertical distance of the Earth’s biosphere measured in km?
    • 20km (atmospheric) + 4km (subsurface) = 24km
  • How was annual cellular production of prokaryotes described in Table 7 column four determined? (Provide an example of the calculation)
    • (Population size) x (# of turnovers/year) = cells/year
    • Example Marine heterotrophs:
      • 3.6 x 10^28 cells x (365 days per year / 16 days per turnover) = 8.2 x 10^29 cells/year
  • What is the relationship between carbon content, carbon assimilation efficiency and turnover rates in the upper 200m of the ocean? Why does this vary with depth in the ocean and between terrestrial and marine habitats?
    • The estimate of prokaryotic carbon in the upper 200 m of the ocean sets limits on the turnover rates of these populations. In turn, the turnover rate can be determined from the carbon assimilation efficiency required to sustain the estimated carbon content in the environment. This value varies with depth and between habitats because different environments support unique communities of prokaryotes with varying productivity and carbon assimilation efficiency, which results in different turnover rates.
  • How were the frequency numbers for four simultaneous mutations in shared genes determined for marine heterotrophs and marine autotrophs given an average mutation rate of 4 x 10^-7 per DNA replication? (Provide an example of the calculation with units. Hint: cell and generation cancel out)
    • (4 x 10^-7)^4 x 8.2 x 10^29 = 2.1 x 10^4 mutations/year
  • Given the large population size and high mutation rate of prokaryotic cells, what are the implications with respect to genetic diversity and adaptive potential? Are point mutations the only way in which microbial genomes diversify and adapt?
    • Large population size and high mutation rate can be considered a major source of genetic diversity and one of the essential factors that allow prokaryotes to first adapt, then evolve in different environments. While point mutations are common, they are not the only way to adapt. Other mechanisms include horizontal transfer of genetic material between different prokaryotes.
  • What relationships can be inferred between prokaryotic abundance, diversity, and metabolic potential based on the information provided in the text?
    • Prokaryote abundance creates an opportunity for frequent mutations and exchanges of genetic material that allow prokaryotes to adapt to different habitats. Adaptation to unique environments will eventually lead to divergence in metabolic potential and ultimately to evolution and diversity.
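
The arithmetic behind three of the answers above can be written out step by step. This is only a sketch that reuses the numbers quoted in the answers. The symbols f, P, N, t and μ are labels introduced here for illustration and are not taken from the paper. The final conversion to hours is an extra step that assumes 8760 hours in a year.

```latex
\begin{align*}
% Fraction of upper-ocean prokaryotes that are cyanobacteria
f_{\text{cyano}} &= \frac{4 \times 10^{4}\ \text{cells/mL}}{5 \times 10^{5}\ \text{cells/mL}} = 0.08 = 8\% \\[6pt]
% Annual cellular production: population size N times turnovers per year (365 / t)
P &= N \times \frac{365\ \text{days/year}}{t}
   = 3.6 \times 10^{28}\ \text{cells} \times \frac{365\ \text{days/year}}{16\ \text{days}}
   \approx 8.2 \times 10^{29}\ \text{cells/year} \\[6pt]
% Four simultaneous mutations: per-replication rate mu raised to the 4th power
\mu^{4} &= \left(4 \times 10^{-7}\right)^{4} = 2.56 \times 10^{-26} \\[6pt]
% Expected number of such events per year across the whole population
\mu^{4} \times P &= 2.56 \times 10^{-26} \times 8.2 \times 10^{29}\ \text{year}^{-1}
   \approx 2.1 \times 10^{4}\ \text{year}^{-1} \\[6pt]
% Equivalent waiting time between events (assumes 8760 hours per year)
\frac{8760\ \text{hours/year}}{2.1 \times 10^{4}\ \text{year}^{-1}} &\approx 0.4\ \text{hours}
\end{align*}
```

Written this way, the units make the hint in the question visible: the per-cell, per-generation terms cancel, leaving a count of events per year.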

Evidence Worksheet_02 Life and the Evolution of Earth’s Atmosphere

Nisbet et al 1998

Learning objectives:

Comment on the emergence of microbial life and the evolution of Earth systems

  • Indicate the key events in the evolution of Earth systems at each approximate moment in the time series. If times need to be adjusted or added to the timeline to fully account for the development of Earth systems, please do so.

    • 4.6 billion years ago
      • formation of the solar system
      • inner planets received water vapour and carbon
      • heavy meteorite bombardment
    • 4.5 billion years ago
      • moon was formed, it gave Earth its spin & tilt, day & night cycles, seasons
    • 4.4 billion years ago
      • formation of Zircon (oldest mineral)
    • 4.2 billion years ago

    • 4.1 billion years ago
      • evidence of life present in Zircon
    • 4.0 billion years ago
      • oldest rock, Acasta gneiss
      • evidence of plate subduction
    • 3.8 billion years ago
      • meteorite bombardment halted
      • sea water chemistry stabilized
      • speculated to be the beginning of life
      • oldest known water-lain sedimentary rock found
    • 3.75 billion years ago
      • early methanogenesis?
    • 3.5 billion years ago
      • photosynthetic cyanobacteria evolved
      • evidence of Rubisco signature indicates global oxygenic photosynthesis
      • sulphate detected in rocks, signifies localized non-reducing conditions
    • 3.0 billion years ago
      • well-developed stromatolites
      • global glaciation
    • 2.7 billion years ago
      • ancestral eukaryotes appeared
      • endosymbiosis of mitochondria and chloroplasts?
    • 2.2 billion years ago
      • Great Oxidation Event, sharp increase of atmospheric oxygen
    • 2.1 billion years ago
      • First undisputed fossil evidence of cyanobacteria, and of photosynthesis
    • 1.7 billion years ago
      • evolution of multicellular eukaryotes

    • 1.3 billion years ago
      • evidence for evolution of land fungi
    • 540 million years ago
      • Cambrian explosion
    • 480 million years ago
      • Devonian explosion
      • land plants appear
    • 550,000 years ago

    • 480,000 years ago

    • 200,000 years ago
      • first appearance of Homo sapiens
  • Describe the dominant physical and chemical characteristics of Earth systems at the following waypoints:

    • Hadean
      • molten Earth due to extreme volcanisms and frequent collisions
      • atmosphere consisted of abundant water vapour, abundant carbon dioxide, and nitrogen
      • early oceans produced by water vapour
      • the Moon-forming impact produced a rock-vapour atmosphere
      • had a range of atmospheric temperatures, from a 100 °C CO2 greenhouse to a glacial “Ice-Hades”, with intervals of warm atmosphere after major impacts
    • Archean
      • temperatures similar to modern day (faint young Sun hypothesis)
      • has reducing atmosphere: H2O, CH4, H2 and NH3
      • intense UV without ozone
      • biochemicals that are the prerequisites for the origin of life present in Late Hadean
    • Precambrian
      • glacial Earth known as Snowball Earth
      • atmosphere hypothesized to be composed primarily of nitrogen, CH2 and other inert gases
      • some oxygen present but not in significant amounts
    • Proterozoic
      • accumulation of oxygen in atmosphere
    • Phanerozoic
      • abundant animal and plant life
      • normal amount of atmospheric oxygen comparable to today

Problem set_02 Microbial Engines

Falkowski et al 1998

Learning objectives:

Discuss the role of microbial diversity and formation of coupled metabolism in driving global biogeochemical cycles.

Specific Questions:

  • What are the primary geophysical and biogeochemical processes that create and sustain conditions for life on Earth? How do abiotic versus biotic processes vary with respect to matter and energy transformation and how are they interconnected?
    • The geophysical processes are tectonics and atmospheric photochemistry, which continuously supply substrates and remove products to create geochemical cycles. The majority of abiotic geochemical reactions are based on acid/base chemistry, whereas biogeochemical reactions are based on redox reactions. The biological fluxes of H, C, N, O, S are largely catalyzed by microbes through redox reactions.
    • The six major elements of life are H, C, N, O, S and P. The biological fluxes of H, C, N, O, S are largely catalyzed by microbes with redox reactions, while geological supply of C, S and P is dependent on tectonics, such as volcanism and rock weathering. Abiotic processes are mostly driven by acid/base reactions and biotic processes by redox reactions. In particular for Earth, the biological oxidation is driven by photosynthesis.
  • Why is Earth’s redox state considered an emergent property?
    • Biotic and abiotic processes together altered the surface redox state of the planet. It is the feedbacks between metabolic and geological processes that create the average redox condition of the oceans and atmosphere.
  • How do reversible electron transfer reactions give rise to element and nutrient cycles at different ecological scales? What strategies do microbes use to overcome thermodynamic barriers to reversible electron flow?
    • On a community scale, as with methanogens, the redox process requires the cooperation of multiple species. For example, methanogenic Archaea produce methane by reducing CO2 with H2. Hydrogen-consuming sulfate reducers in the vicinity lower the hydrogen concentration, which makes the reverse process thermodynamically favorable. The methane can then be oxidized by other species to release H2. On a microbiological scale, inside the cell, some cycles are themselves reversible, like the TCA cycle. Some archaea can use the TCA cycle both to oxidize organic carbon into CO2 to release energy and to assimilate CO2 into organic matter by using energy. The energy produced by the oxidative process can feed into the reductive process.
  • Using information provided in the text, describe how the nitrogen cycle partitions between different redox niches and microbial groups. Is there a relationship between the nitrogen cycle and climate change?

  • What is the relationship between microbial diversity and metabolic diversity and how does this relate to the discovery of new protein families from microbial community genomes?
    • Metabolic pathways usually evolved to use substrates that are the end products of other types of microbial metabolism, such as heterotrophs using the organic carbon byproducts of photosynthesis. Early cellular evolution relied on horizontal gene transfer as the principal mode of evolution. The genes responsible for major metabolic processes, or for entire metabolic pathways, were probably distributed in a common gene pool before further differentiation of species. Afterwards, nutritional and bioenergetic selective pressures drove the retention of these horizontally transferred genes.
  • On what basis do the authors consider microbes the guardians of metabolism?
    • Microbes can maintain genes for metabolism through HGT. Even if one community becomes extinct, the wide-spread distribution of these genes across many communities and niches will allow the metabolic pathways to be preserved.

Evidence Worksheet_03 The Anthropocene

Rockstrom et al 2009

Learning objectives:

• Evaluate human impacts on the ecology and biogeochemistry of Earth systems.

General questions

  • What were the main questions being asked?
    • What are the key variables, or planetary systems, that are negatively affected by human operation with respect to the Earth system and are associated with the planet’s biophysical subsystems or processes?
    • How can we define the parameters and threshold of each “planetary boundaries” that act as guidelines for safe operating space for humanity?
    • How far has humanity pushed each of the planetary boundaries and what are the consequences?
  • What were the primary methodological approaches used?
    • Planetary boundaries are set as values of control variables that are either at a “safe” distance from thresholds, based on evidence of threshold behavior in certain processes, or at dangerous levels for processes that do not have a well-defined threshold. The large uncertainties surrounding the true thresholds are taken into consideration. A safe distance is determined through normative judgements of how societies choose to deal with risk and uncertainty.
  • Summarize the main results or findings.
    • The nine planetary boundaries defined were: atmospheric aerosol loading, chemical pollution, climate change, ocean acidification, stratospheric ozone depletion, nitrogen and phosphorus cycle, global freshwater use, change in land use, biodiversity loss.
    • Three of the Earth-system processes have already transgressed their boundaries, and their continued deterioration will significantly impact the resilience of major components of Earth-system functioning. They are:
      • Climate change
      • Biodiversity loss
      • Interference with the nitrogen cycle
    • Climate change: the current CO2 concentration has transgressed past the boundary due to underestimation of the long-term effects of greenhouse gases and may lead to long-term irreversible climate change such as loss of major ice sheets, accelerated sea level rise and shifts in biological systems
    • Rate of biodiversity loss: human activity in the Anthropocene has accelerated the rate of species extinction to 100-1000 times the natural rate, mainly due to loss of habitats from changes in land use and the speed of climate change
    • Nitrogen and phosphorus cycles: additional input of nitrogen and phosphorus from human large-scale production has begun to significantly disturb their global cycles
  • Do new questions arise from the results?
    • The parameters and boundaries for biodiversity loss and for human modification of the nitrogen cycle are vague; more research is required to pin down these boundaries with greater certainty
    • A better understanding of the essential Earth processes and of human actions is needed in order to advance global change research and sustainability science
    • Exploration into the complex dynamic interactions and self-regulation of living systems is also required to better appreciate thresholds and shifts between states. This will help us realize the severity of environmental conditions.
  • Were there any specific challenges or advantages in understanding the paper (e.g. did the authors provide sufficient background information to understand experimental logic, were methods explained adequately, were any specific assumptions made, were conclusions justified based on the evidence, were the figures or tables useful and easy to understand)?
    • The paper is written in a way that is easy to understand, and the figures are well presented and explained in relation to the analysis. Although the methods of defining the boundaries and thresholds were explained inadequately for biodiversity and the nitrogen cycle, the authors did emphasize the uncertainty in the definitions of the boundaries used in the paper. Assumptions were made when the authors speculated about the long-term consequences of transgressing the thresholds of the boundaries. I believe the conclusions are largely justified, as the paper used quantitative evidence of how human actions have altered the Earth’s natural cycles.

Writing assessment_01

"Microbial life can easily live without us; we, however, cannot survive without the global catalysis and environmental transformations it provides." Do you agree or disagree with this statement? Answer the question using specific reference to your reading, discussions and content from evidence worksheets and problem sets.

For billions of years, microbial engines drove the establishment of Earth’s geochemical cycles and coevolved with the ancestors that ultimately led to the evolution of Homo sapiens. Humans have rapidly expanded industrialization over the past few hundred years and sharply shifted from an orthodox philosophy of restoring nature to its original form towards engineering a novel environmental landscape (1). Does the future of our planet therefore lie solely in the hands of humans, with microbes merely a remnant of the past? On one hand, microbes are an essential part of Earth’s biogeochemical cycles of the past, the present and the future. On the other hand, humans are currently en route to an interventionalist approach, attempting to micromanage planetary balance through geological, atmospheric, genetic and biological engineering in place of microbes. Even in the face of technological advancements, humans will continue to rely on fossil fuels, which have greatly increased detrimental long-term global climate change. Ultimately, humans must rely on the metabolic functions of microbes in the biogeochemical cycles to ensure the future survival of our species.

Theme 1: Mechanisms and outcomes of microorganisms in shaping the early Earth
Mechanisms employed by micro-organisms to self-sustain and in shaping the early Earth
Microbes have thrived for more than half of the 4.5 billion years of Earth’s history, evolving metabolic processes that altered the planet’s geochemical framework while ensuring their own survival. To understand the influence microorganisms have on the environment and on modern life, it is important to discuss the history of microbial life and the mechanisms microbes employed to alter the planet. The earliest evidence of microbial activity dates to at least 3.5 billion years ago (1), and is derived from stable-isotope fractionation that suggests sulfate reduction and methanogenesis (2). At the time, microbial organisms had already begun to influence the environment through thermodynamically constrained redox reactions that linked various geochemical cycles, working together with the acid/base reactions of planetary geochemical processes (2). For example, oxygenation of the atmosphere resulted from imbalances between the marine biotic and geochemical cycles, through “leakages” from the marine organic carbon cycle in the form of sedimented organic matter (4). If the cyclic transfer of electrons in a redox reaction is a microcosm of metabolism in a single cell, the synergistic cooperation of multi-species microbial assemblages drives larger-scale metabolism in an ecological community to ensure self-sustainability. The nitrogen cycle is an example in which the environment is influenced by species using the by-products excreted by other species as substrates to generate energy for growth and reproduction (2). Moreover, these core metabolic genes have been widely dispersed and maintained by horizontal gene transfer (HGT) between various microbial communities (2).

Microbial influence on altering Earth’s atmosphere
Microbial metabolism paved the way for the evolution and survival of humans on Earth. The primitive atmosphere of early Earth consisted mainly of methane and carbon dioxide from a mix of biotic and abiotic processes, such as methanogenic archaea and the release of mantle gases (3, 4). Firstly, cyanobacteria established the oxygenic atmosphere that is essential to human existence. These microbes generated a rapid rise in atmospheric oxygen as a by-product, leading to the Great Oxidation Event 2.45 Gya, which caused a mass extinction in anaerobic niches and inadvertently paved the way for the establishment of aerobic microorganisms (3). The advent of aerobic respiration provided a source of power that increased energy productivity, which permitted the development of complex eukaryotic life (3). Fast forward to 1.5 Gya: cyanobacteria helped eukaryotic cells harness photosynthetic metabolism through endosymbiosis, as suggested by ribosomal RNA analysis, and created a new niche of photosynthetic eukaryotes that would ultimately become the primary producers supporting the majority of the hierarchy of life. Secondly, feedback between biotic and abiotic processes maintains the balance of the six major elements that sustain life: hydrogen (H), carbon (C), nitrogen (N), oxygen (O), sulfur (S) and phosphorus (P). Biological organisms cycle these elements through various chemical forms, and abiotic geological processes degrade and regenerate available forms of these elements as input to the biotic cycle (2).

Theme 2: Humans are currently incapable of maintaining biogeochemical balance

Industrial replacement of microbial processes leads to disruption of biogeochemical cycles
The confluence of a human population boom and technological leaps has triggered an irreversible era, a new Anthropocene landscape that requires constant micro-management and sometimes reengineering to maintain ecological balance, and one that we have a difficult time keeping up with. The start and continuation of the industrial revolution over the past 200 years represent a major step in how humans influence the environment and consume Earth’s materials. Humans are able to define specific operational spaces in the geochemical cycles and are capable of replacing microbial metabolism in a defined niche on an industrial scale. For example, the Haber-Bosch process fixes nitrogen twice as fast as the natural rate of fixation in the nitrogen cycle (5). However, this disproportionate input of an element has impacted not only its own cycle but also virtually all natural geochemical cycles. The balance in these cycles, largely controlled and maintained by microbes in concert with geological processes, has been disrupted. The distortion of basic biogeochemical cycles across the globe is a result of the decoupling of intimate interactions between the natural cycles (6). For example, the production of ammonium is no longer solely dependent on the oxygen cycle, but rather more so on industrial synthesis. Reduced interdependence between natural cycles will eventually require constant human intervention to manage biogeochemical balance and to fix problems caused by previous interventions (6). Because our understanding of microbial responses to environmental change is complicated by the largely unexplored diversity of microbial communities, and still has a long way to go, humans are limited as interventionalists in dictating the state of our planet.
Challenges in maintaining complex technological systems
Even assuming humans eventually achieve a complete understanding of microbial diversity on the planetary scale and replace microbes as Earth’s guardians, challenges remain in managing the complex technological systems that interventionalism would require. Recent examples, such as the BP oil spill, which happened despite elaborate monitoring systems and lines of defense against oil-well blowout, and the failure of the Fukushima nuclear plant to withstand natural disasters, show that complex technological systems are still vulnerable to rare but consequential failures (7). Microbes, on the other hand, have for billions of years harnessed natural feedback processes that drive out-of-balance environments to new steady states, which further calls into question whether people can intelligently manage something as complicated as the natural world.

Theme 3: Microbial metabolisms are needed for human survival
Microbes are irreplaceable decomposers

The reliance on fossil fuels and industrialized forms of agriculture has increased the risk of irreversible long-term climate change, which is also a driver of declining biodiversity and altered large-scale geochemical cycles (5). High input of elements from human sources leads to large waste accumulation. Microorganisms in general perform the initial biomass decomposition, and certain communities play unique and essential roles in completing the task, especially under anoxic conditions (8). Would higher microbial life forms be able to decompose waste products and return the elements to the natural cycle sufficiently? Without microbial decomposers, biomass would likely begin to accumulate starting from the molecular level, eventually creating vast reservoirs of un-transformable biogeochemical waste that cannot be harnessed by the primary producers and consumers that ultimately sustain the rest of the global food chain (8). For example, the disappearance of the non-renewable element phosphorus would impact marine primary production. Rescuing the system through anthropogenic input would be a challenge given the depletion of phosphorus mines (8). As the establishment of a global-scale anthropogenic decomposition system is unlikely in the near future, microbes remain essential to ensuring planetary and human sustainability.

Harnessing microbial metabolism as renewable energy source
In addition to the current sustaining function of microbes in biogeochemical processes, their metabolic potential can also be harnessed for protective interventions. Microbial engineering has the potential to produce renewable biofuel to meet the increasing demand for energy and eventually replace the burning of fossil fuels. Although the concept is the opposite of the natural distribution of core metabolic genes through HGT, research has begun to investigate the potential to arm microorganisms with human-designed genetic frameworks for the large-scale conversion of cellulose into ethanol, a common biofuel (9).

Conclusion
Microbes have played a vital part in providing the environment required for human evolution and continue to be integral to human survival. Not only did microbes exist long before our species, but their drive for self-sustainability also led to the establishment of Earth’s current atmosphere, making it suitable for humans to thrive. However, humans’ lack of understanding of the unintended consequences of our current actions has led to imbalances in the biogeochemical cycles. We may ultimately approach an era in which humans take on managerial roles for the planet. However, the information missing from the complete picture of microbial metabolic pathways is likely to continue to result in failures of the next-generation technological systems developed for interventionalism. In this sense, humans must still rely on the microbial metabolic pathways necessary to complete biogeochemical cycles, especially on microbes as decomposers of the planet. With a better understanding of microbial metabolism, these pathways may be used as renewable sources of energy and have the potential to save Earth from the destruction created by our ignorance.

Writing Assignment 01 references

  1. Nisbet, EG, Sleep, NH. 2001. The habitat and nature of early life. Nature. 409:1083-1091. doi: 10.1038/35059210.
  2. Falkowski, PG, Fenchel, T, Delong, EF. 2008. The Microbial Engines That Drive Earth’s Biogeochemical Cycles. Science. 320:1034-1039.
  3. Sessions, AL, Doughty, DM, Welander, PV, Summons, RE, Newman, DK. 2009. The Continuing Puzzle of the Great Oxidation Event. Current Biology. 19:R574.
  4. Kasting, J, Siefert, J. 2003. Life and the evolution of Earth’s atmosphere (vol 296, pg 1066, 2002). Science. 299:1015-1015.
  5. Rockström, J, Steffen, W, Noone, K, Scheffer, M. 2009. A safe operating space for humanity. Nature. 461:472-475.
  6. Falkowski, PG. 2015. Life’s Engines: How Microbes Made Earth Habitable. Princeton University Press, Princeton, NJ, U.S.A.
  7. Achenbach, J. 2012. Spaceship Earth: A new view of environmentalism. The Washington Post.
  8. Gilbert, JA, Neufeld, JD. 2014. Life in a World without Microbes. PLoS Biol. 12:12.
  9. Burdass, D, Hurst, J. 2008. Microbes and Climate Change. Society for General Microbiology.

Module 01 references

Whitman WB, Coleman DC, and Wiebe WJ. 1998. Prokaryotes: The unseen majority. Proc Natl Acad Sci USA. 95(12):6578-6583. PMC33863
Nisbet EG and Sleep NH. 2001. The habitat and nature of early life. Nature. 409(6823):1083-1091. PMID11234022

Falkowski PG, Fenchel T, and Delong EF. 2008. The microbial engines that drive Earth’s biogeochemical cycles. Science. 320(5879):1034-1039. PMID18497287

Rockström J, Steffen W, Noone K, Persson A, Stuart Chapin II F, Lambin EF, Lenton TM … Foley JA. 2009. A safe operating space for humanity. Nature. 461:472-475. doi:10.1038/461472a

Module 02

Evidence worksheet_04 Bacterial Rhodopsin Gene Expression

Martinez et al 2007

Learning objectives

Discuss the relationship between microbial community structure and metabolic diversity.
Evaluate common methods for studying the diversity of microbial communities.
Recognize basic design elements in metagenomic workflows.

General questions

  • What were the main questions being asked?

    • What is the minimum number of genes needed to generate a fully functional proteorhodopsin (PR) system, and what are these genes?
    • How did the PR system become so ubiquitous among diverse microbial taxa?
    • What does each gene product in the photosystem biosynthetic pathway do, and what is the specific function of the PR system?
  • What were the primary methodological approaches used?
    • Screening fosmid (large-insert DNA) libraries derived from marine picoplankton for visibly detectable PR-expressing phenotypes in vivo
    • Sequencing of the PR photosystem-containing clones
  • Summarize the main results or findings.
    • The PR photosystem-containing clones have the highest identity to Alphaproteobacteria.
    • The clones contained a six-gene operon encoding putative enzymes for beta-carotene and retinal biosynthesis. These six genes alone can enable light-activated photophosphorylation in a heterologous host.
    • The PR system is speculated to have spread throughout different microbial lineages by single lateral transfer events because it is important for cellular bioenergetics, is simple and compact, and has a plasticity that enables it to persist in diverse phylogenetic groups.
    • PRs appear to function as light-activated ion pumps that contribute positively to cellular energy metabolism.
  • Do new questions arise from the results?
    • Can the established link between PR and light-induced growth stimulation in Flavobacteria be further strengthened, given that some studies suggest otherwise?
  • Were there any specific challenges or advantages in understanding the paper (e.g. did the authors provide sufficient background information to understand experimental logic, were methods explained adequately, were any specific assumptions made, were conclusions justified based on the evidence, were the figures or tables useful and easy to understand)?
    • I wish the methods section had explained why each specific technique was chosen for the experiment. Without this background information, it is difficult to establish the assumptions that underlie the chosen genetic analysis techniques.

Problem set_03 “Metagenomics: Genomic Analysis of Microbial Communities”

Learning objectives:

Specific emphasis should be placed on the process used to find the answer. Be as comprehensive as possible e.g. provide URLs for web sources, literature citations, etc.
(Reminders for how to format links, etc in RMarkdown are in the RMarkdown Cheat Sheets)

Specific Questions:

  • How many prokaryotic divisions have been described and how many have no cultured representatives (microbial dark matter)?

    • As of 2016, 89 bacterial phyla and 20 archaeal phyla had been described in small-subunit (16S) rRNA databases. However, there could be up to 1500 bacterial phyla, as some microbes live in the “shadow biosphere”, a microbial biosphere containing microbes that employ metabolic processes radically different from those of currently known life.
    • In 2003, studies reported that 26 of the 52 major bacterial phyla had been cultivated. With advancing understanding of microbial diversity and metabolic processes, as well as better culturing technologies, more of the previously “unculturable” prokaryotic divisions have likely been cultured since.
  • How many metagenome sequencing projects are currently available in the public domain and what types of environments are they sourced from?
    • According to the EBI database alone, there are 110217 sequencing projects in the public domain. Undoubtedly, more projects exist in other databases. Sequencing projects can be sourced from almost any environment, but most come from the gut, soil, sediments and aquatic environments. These environments are popular for sequencing because their inhabitants are hard to culture in a lab setting.
  • What types of on-line resources are available for warehousing and/or analyzing environmental sequence information (provide names, URLS and applications)?

    • Shotgun metagenomics:
      • Assembly – EULER, ING
      • Binning – S-GSOM
      • Annotation – KEGG
      • Analysis pipelines – MEGAN 5 (need to BLAST sequences afterwards)
      • Data warehousing (includes a BLAST-type database already) – IMG/M, MG-RAST, NCBI
    • Marker Gene Metagenomics
      • Standalone software – OTUbase
      • Analysis pipelines – SILVA
      • Denoising – Amplicon Noise
      • Databases – Ribosomal Database Project (RDP)
  • What is the difference between phylogenetic and functional gene anchors and how can they be used in metagenome analysis?

| Phylogenetic | Functional |
|---|---|
| Vertical gene transfer | Horizontal gene transfer |
| Carry phylogenetic information allowing tree reconstruction | Identify specific biogeochemical functions associated with measurable effects |
| Taxonomic | Not as useful for phylogeny |
| Ideally single-copy | |
  • What is metagenomic sequence binning? What types of algorithmic approaches are used to produce sequence bins? What are some risks and opportunities associated with using sequence bins for metabolic reconstruction of uncultivated microorganisms?

    • Binning groups the sequences recovered from a community into clusters (bins) that an algorithm infers to come from the same organism or group of organisms, analogous to assigning them to OTUs. The purpose is to reconstruct the genome of an organism or group of organisms from sequence fragments. A bin contains all of the sequence variation in a population sharing the same genome within a community.
    • There are two types of algorithms:
      • Based on sequence alignment to databases
      • Based on organism-specific characteristic: GC content, codon usage
    • Metrics of a good bin include
      • percentage completeness of sequence
      • percentage of contamination
    • The risks can include
      • Incomplete coverage of the genome sequence
      • Contamination with sequences from a different phylogenetic group
  • Is there an alternative to metagenomic shotgun sequencing that can be used to access the metabolic potential of uncultivated microorganisms? What are some risks and opportunities associated with this alternative?
    • Functional screens
      • Biochemical
      • Make big libraries and probe for phenotypes of interest
    • Single cell sequencing: Amplify genome from a single cell
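
The composition-based binning approach mentioned in the binning answer above (grouping sequences by organism-specific characteristics such as GC content) can be sketched in a few lines of base R. This is only an illustrative sketch: the contig sequences are invented, the helper `gc_content` is hypothetical, and hierarchical clustering on a single feature stands in for the richer signatures (e.g. codon usage) that real binning tools use.

```r
# Hypothetical contigs; the sequences are invented for illustration only.
contigs <- c(
  contig_1 = "ATATTTAAATTAGCATAT",
  contig_2 = "GCGCGGCCGCGATCGCGG",
  contig_3 = "ATTTAAATATATGCATTA",
  contig_4 = "GGCCGCGCGGGCATCGCC"
)

# Fraction of G and C bases in a DNA string.
gc_content <- function(seq) {
  bases <- strsplit(toupper(seq), "")[[1]]
  sum(bases %in% c("G", "C")) / length(bases)
}

gc <- vapply(contigs, gc_content, numeric(1))

# Group contigs by composition: hierarchical clustering on GC content,
# cut into two bins (the number of bins is an assumption of this sketch).
bins <- cutree(hclust(dist(gc)), k = 2)

data.frame(gc = round(gc, 2), bin = bins)
```

In this toy case the two AT-rich contigs fall into one bin and the two GC-rich contigs into the other. With real data, contigs from different organisms can share a similar composition, which is one source of the contamination risk listed above.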

Module 02 references

Martinez A, Bradley AS, Waldbauer JR, Summons RE, DeLong EF. 2007. Proteorhodopsin photosystem gene expression enables photophosphorylation in a heterologous host. Proc Natl Acad Sci USA. 104(13):5590-5595. PMC1838496

Module 03

Problem set_04 “Fine-scale phylogenetic architecture”

Learning objectives:

  • Gain experience estimating diversity within a hypothetical microbial community

Part 1: Description and enumeration

Obtain a collection of “microbial” cells from “seawater”. The cells were concentrated from different depth intervals by a marine microbiologist travelling along the Line-P transect in the northeast subarctic Pacific Ocean off the coast of Vancouver Island, British Columbia.

Sort out and identify different microbial “species” based on shared properties or traits. Record your data in this Rmarkdown using the example data as a guide.

Once you have defined your binning criteria, separate the cells using the sampling bags provided. These operational taxonomic units (OTUs) will be considered separate “species”. This problem set is based on content available at What is Biodiversity.

For example, load in the packages you will use.

#To make tables
library(kableExtra)
library(knitr)
#To manipulate and plot data
library(tidyverse)

Then load in the data. You should use a similar format to record your community data. Finally, use these data to create a table.

For your sample:

  • Construct a table listing each species, its distinguishing characteristics, the name you have given it, and the number of occurrences of the species in the collection.

*Data were collected in Excel and exported as a TSV file

CandyCommunity = read.table (file="Candy Seawater.txt", header=TRUE, row.names=1, sep="\t", na.strings=c("NAN","NA","."))
CandySample_V2 = read.table (file="Candy Sample.txt", header=TRUE, row.names=1, sep="\t", na.strings=c("NAN","NA","."))
  • Ask yourself if your collection of microbial cells from seawater represents the actual diversity of microorganisms inhabiting waters along the Line-P transect. Were the majority of different species sampled or were many missed?
  • It’s difficult to conclude whether my collection represents the actual diversity without further analysis. Because the collection methods are unknown, it is uncertain whether there was any sampling bias or whether a sufficient number of samples was taken from different areas of the seawater.

Part 2: Collector’s curve

To help answer the questions raised in Part 1, you will conduct a simple but informative analysis that is a standard practice in biodiversity surveys. This analysis involves constructing a collector’s curve that plots the cumulative number of species observed along the y-axis and the cumulative number of individuals classified along the x-axis. This curve is an increasing function with a slope that will decrease as more individuals are classified and as fewer species remain to be identified. If sampling stops while the curve is still rapidly increasing then this indicates that sampling is incomplete and many species remain undetected. Alternatively, if the slope of the curve reaches zero (flattens out), sampling is likely more than adequate.

To construct the curve for your samples, choose a cell within the collection at random. This will be your first data point, such that X = 1 and Y = 1. Next, move consistently in any direction to a new cell and record whether it is different from the first. In this step X = 2, but Y may remain 1 or change to 2 if the individual represents a new species. Repeat this process until you have proceeded through all cells in your collection.
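
The counting procedure described above can also be done in R once the order in which cells were drawn is recorded. The sketch below assumes a hypothetical vector `observed` holding the species label of each cell in draw order (the labels are invented); `x` is the cumulative number of individuals and `y` the cumulative number of distinct species seen so far.

```r
# Hypothetical draw order of cells; the species labels are made up.
observed <- c("red", "blue", "red", "green", "blue", "red", "yellow", "green")

# x: cumulative individuals classified.
# y: cumulative distinct species, which increases only when a label
#    appears for the first time.
collectors_curve <- data.frame(
  x = seq_along(observed),
  y = cumsum(!duplicated(observed))
)

collectors_curve
```

A data frame built this way has the same `x` and `y` columns that a collector's curve plot needs, so it could be passed to `ggplot` in place of a hand-recorded table.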

Create a plot. We will use a scatterplot (geom_point) to plot the raw data and then add a smoother to see the overall trend of the data.

For your sample:

  • Create a collector’s curve.

*Data were collected in Excel and exported as a TSV file

Collectors_Curve_V2 = read.table (file="Collectors Curve.txt", header=TRUE, row.names=1, sep="\t", na.strings=c("NAN","NA","."))
ggplot(Collectors_Curve_V2, aes(x=x, y=y)) +
  geom_point() +
  geom_smooth() +
  labs(x="Cumulative number of individuals classified", y="Cumulative number of species observed")
## `geom_smooth()` using method = 'loess'

  • Does the curve flatten out? If so, after how many individual cells have been collected?
  • Yes, the curve flattens out after 38 individual cells have been collected.

  • What can you conclude from the shape of your collector’s curve as to your depth of sampling?
  • The collector’s curve is an increasing function with a slope that decreases as more individuals are classified and fewer species remain to be identified. The curve approaches zero slope at the top, meaning that few or no new species remain to be identified. Sampling is likely to be adequate.

Part 3: Diversity estimates (alpha diversity)

Using the table from Part 1, calculate species diversity using the following indices or metrics.

Diversity: Simpson Reciprocal Index

\(\frac{1}{D}\) where \(D = \sum p_i^2\)

\(p_i\) = the fractional abundance of the \(i^{th}\) species

The higher the value is, the greater the diversity. The maximum value is the number of species in the sample, which occurs when all species contain an equal number of individuals. Because the index reflects both the number of species present (richness) and the relative proportions of each species within a community (evenness), this metric is a diversity metric. Consider that communities can have the same number of species (equal richness) but manifest a skewed distribution in the proportion of each species (unequal evenness), which would result in different diversity values.
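
As a rough illustration of the formula and of the richness-versus-evenness point, the sketch below computes 1/D for two hypothetical three-species communities. The function name `simpson_reciprocal` and the counts are made up for this example.

```r
# Simpson Reciprocal Index (1/D) from a vector of per-species counts.
simpson_reciprocal <- function(counts) {
  p <- counts / sum(counts)  # fractional abundance p_i of each species
  1 / sum(p^2)               # D is the sum of squared abundances
}

# Same richness (3 species), different evenness.
uneven <- simpson_reciprocal(c(2, 4, 1))  # skewed abundances
even   <- simpson_reciprocal(c(5, 5, 5))  # equal abundances

c(uneven = uneven, even = even)
```

The even community reaches the maximum value of 3 (the number of species), while the skewed community gives 49/21, or about 2.33.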

  • What is the Simpson Reciprocal Index for your sample?

  • The Simpson Reciprocal Index was calculated in Excel.
  • Community sample: 23.17449389
  • Individual Sample: 24.96018735

Richness: Chao1 richness estimator

Another way to calculate diversity is to estimate the number of species that are present in a sample based on the empirical data to give an upper boundary of the richness of a sample. Here, we use the Chao1 richness estimator.

\(S_{chao1} = S_{obs} + \frac{a^2}{2b}\)

\(S_{obs}\) = total number of species observed; \(a\) = number of species observed once; \(b\) = number of species observed twice or more

So for our previous example community of 3 species with 2, 4, and 1 individuals each, \(S_{chao1}\) =

3 + 1^2/(2*2)
## [1] 3.25
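
The same arithmetic can be wrapped in a small helper that counts \(a\) and \(b\) directly from a vector of per-species abundances. This is a sketch, not a library function: the name `chao1` and the `doubletons_only` argument are hypothetical. By default it follows the definition used in this worksheet (b = species observed twice or more). The commonly cited form of Chao1 counts only species observed exactly twice, which the optional argument switches on.

```r
# Chao1 richness estimate from per-species counts.
# Default: b = species seen twice or more (this worksheet's definition).
# doubletons_only = TRUE: b = species seen exactly twice.
chao1 <- function(counts, doubletons_only = FALSE) {
  counts <- counts[counts > 0]   # ignore species that were never observed
  s_obs <- length(counts)        # observed richness
  a <- sum(counts == 1)          # species observed once
  b <- if (doubletons_only) sum(counts == 2) else sum(counts >= 2)
  s_obs + a^2 / (2 * b)          # note: returns Inf if a > 0 and b is 0
}

worksheet_value  <- chao1(c(2, 4, 1))
doubletons_value <- chao1(c(2, 4, 1), doubletons_only = TRUE)

c(worksheet = worksheet_value, doubletons = doubletons_value)
```

For the example community this gives 3.25 under the worksheet definition and 3.5 when only doubletons are counted, so the choice of definition is worth stating alongside any reported estimate.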
  • What is the chao1 estimate for your sample?

Community sample:

51 + (13^2/(2*38))
## [1] 53.22368

Individual sample:

39 + (13^2/(2*26))
## [1] 42.25

Part 4: Alpha-diversity functions in R

We’ve been doing the above calculations by hand, which is a very good exercise to aid in understanding the math behind these estimates. Not surprisingly, these same calculations can be done with R functions. Since we just have a species table, we will use the vegan package. You will need to install this package if you have not done so previously.

library(vegan)

First, we must remove the unnecessary data columns and transpose the data so that vegan reads it as a species table with species as columns and rows as samples (of which you only have 1).

Then we can calculate the Simpson Reciprocal Index using the diversity function.

And we can calculate the Chao1 richness estimator (and others by default) with the specpool function for extrapolated species richness. This function rounds to the nearest whole number so the value will be slightly different than what you’ve calculated above.

In Project 1, you will also see functions for calculating alpha-diversity in the phyloseq package since we will be working with data in that form.

For your sample:

  • What is the Simpson Reciprocal Index using the R function?

For community sample:

CandyCommunity_diversity = 
     CandyCommunity %>% 
     select(name, occurences) %>% 
     spread(name, occurences)
diversity(CandyCommunity_diversity, index="invsimpson")
## [1] 23.17449

For individual sample:

CandySample_diversity_V2 = 
     CandySample_V2 %>% 
     select(name, occurences) %>% 
     spread(name, occurences)
diversity(CandySample_diversity_V2, index="invsimpson")
## [1] 24.96019
  • What is the chao1 estimate using the R function?

For Community Sample:

specpool(CandyCommunity_diversity)
##     Species chao chao.se jack1 jack1.se jack2 boot boot.se n
## All      51   51       0    51        0    51   51       0 1

For Individual Sample:

specpool(CandySample_diversity_V2)
##     Species chao chao.se jack1 jack1.se jack2 boot boot.se n
## All      39   39       0    39        0    39   39       0 1

*Verify that these values match your previous calculations.
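The Simpson Reciprocal Index values from the manual calculation and the R function match for both the individual and community samples. However, the Chao1 values do not match: specpool returns the observed richness (51 and 39) because it estimates richness from species incidence across multiple sampling units, and each table here contains only one sample (n = 1).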

Part 5: Concluding activity

If you are stuck on some of these final questions, reading the Kunin et al. 2010 and Lundin et al. 2012 papers may provide helpful insights.

  • How does the measure of diversity depend on the definition of species in your samples?
    • The measure of diversity depends on how I grouped the different organisms (pieces of candy) into species according to their characteristics. If more organisms are grouped into the same species, there are fewer types of organisms in the community sample and the measured diversity decreases. As a result, both the alpha-diversity and the Chao1 value decrease.
  • Can you think of alternative ways to cluster or bin your data that might change the observed number of species?
    • The observed number of species also changes based on what I define as “a single species”. For example, the Twizzlers are long and filamentous. We defined a clump of them as one species, while others may define each string as a species.
  • How might different sequencing technologies influence observed diversity in a sample?

    • The parameters of the sequencing pipeline matter: for example, the sequencing platform, the sample preparation (whether samples are processed consistently to extract their DNA), and whether the pipeline examines the same gene region (e.g. within the 16S rRNA gene) to distinguish individuals.
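
The effect of the species definition on the diversity estimates discussed above can be illustrated with a small sketch. The counts and candy names below are invented, and `inverse_simpson` is a hypothetical helper: the same individuals are binned two ways, once with two Twizzler types kept separate and once with them lumped into a single species.

```r
# Hypothetical counts for four candy "species"; the values are invented.
split_bins <- c(twizzler_red = 6, twizzler_black = 4, gummy_bear = 8, skittle = 2)

# Alternative binning: treat both Twizzler types as a single species.
lumped_bins <- c(
  twizzler = sum(split_bins[c("twizzler_red", "twizzler_black")]),
  split_bins[c("gummy_bear", "skittle")]
)

# Simpson Reciprocal Index from per-species counts.
inverse_simpson <- function(counts) 1 / sum((counts / sum(counts))^2)

comparison <- data.frame(
  binning = c("split", "lumped"),
  richness = c(length(split_bins), length(lumped_bins)),
  inverse_simpson = c(inverse_simpson(split_bins), inverse_simpson(lumped_bins))
)

comparison
```

With the same 20 individuals, lumping reduces richness from 4 to 3 and lowers the Simpson Reciprocal Index, which is the pattern described in the first answer above.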

Evidence Worksheet_05 “Extensive mosaic structure”

Welch et al 2002

Part 1: Learning objectives

Evaluate the concept of microbial species based on environmental surveys and cultivation studies.
Explain the relationship between microdiversity, genomic diversity and metabolic potential.
Comment on the forces mediating divergence and cohesion in natural microbial communities.

General questions

  • What were the main questions being asked?
    • What are the genetic bases for pathogenicity and the evolutionary diversity of E. coli, as revealed by comparing the genomes of the uropathogenic strain CFT073, the enterohemorrhagic strain EDL933 and the non-pathogenic lab strain MG1655?
    • Where did the pathogenicity islands of CFT073 come from?
    • How do different strains of the same species differ at the genetic level, and how do those differences relate to each strain’s phenotype?
    • How does gene transfer change the ecotype through the acquisition of gene islands?
    • What is the relationship between microdiversity, genomic diversity and metabolic potential?
  • What were the primary methodological approaches used?
    • Automated Sanger sequencing (dye-terminator chemistry) of CFT073 on 3700 sequencing machines
    • Searching the predicted proteins with BLAST
  • Summarize the main results or findings.
    • More than 70% of the ORFs unique to MG1655 or EDL933 have been replaced with new genes specific to CFT073
    • Among the 204/2004 genes shared with EDL933, some encode iron-uptake systems, fatty acid biosynthetic enzymes, adhesins, phosphotransferase systems and ABC transport systems.
    • The difference in disease potential between CFT073 and EDL933 is reflected in the absence from CFT073 of the type III secretion system and the phage- and plasmid-encoded virulence genes that are common in E. coli O157:H7
    • There is selective pressure on the expression of pili, which vary among the E. coli lineages
    • Backbone genes were acquired through vertical transmission and have changed relatively little, whereas genes acquired by HGT distinguish the lineages.
    • The three strains share only 39.2% of their combined set of proteins
    • Islands tend to encode adaptive traits that increase a strain’s fitness in different environments, leading to distinct ecotypes and pathogenesis
  • Do new questions arise from the results?
    • Are there similarly large differences between pathogenic and non-pathogenic strains of other species?
    • What is a reasonable way to classify species, given that strains of the same species share only 39.2% of their proteins?
  • Were there any specific challenges or advantages in understanding the paper (e.g. did the authors provide sufficient background information to understand experimental logic, were methods explained adequately, were any specific assumptions made, were conclusions justified based on the evidence, were the figures or tables useful and easy to understand)?
    • The methods section is short, with not enough explanation of how the sequencing techniques and technologies were used or of how the authors reached their conclusions about the different pathogenicity islands and genome differences

Part 2: Learning objectives

Comment on the creative tension between gene loss, duplication and acquisition as it relates to microbial genome evolution.
Identify common molecular signatures used to infer genomic identity and cohesion.
Differentiate between mobile elements and different modes of gene transfer.

Specific question

  • Based on your reading and discussion notes, explain the meaning and content of the following figure derived from the comparative genomic analysis of three E. coli genomes by Welch et al. Remember that CFT073 is a uropathogenic strain and that EDL933 is an enterohemorrhagic strain. Explain how this study relates to your understanding of ecotype diversity. Provide a definition of ecotype in the context of the human body. Explain why certain subsets of genes in CFT073 provide adaptive traits under your ecological model and speculate on their mode of vertical descent or gene transfer.
    • Ecotype – distinct strains that occupy a habitat
    • In the human context – different strains of E. coli can occupy different habitats in the human body because they have different niches. For example: the uropathogenic strain in the urinary tract, the enterohemorrhagic strain in the intestines.
    • They have gene islands that help them specifically adapt to different niches in the body
    • Islands shared between the two strains could’ve come from:
      • Horizontal gene transfer
      • From common ancestors where the two diverged from (vertical gene transfer)

Module 03 references

Welch RA, Burland V, Plunkett II G, Redford P, Roesch P, Rasko D, Buckles EL … Blattner FR. 2002. Extensive mosaic structure revealed by the complete genome sequence of uropathogenic Escherichia coli. Proc Natl Acad Sci USA. 99(26):17020-17024. PMC139262